The addition to noweb described here is based on the following two premises:
- It should be as independent of the target language as possible, and
- We don't want to write a full-blown scanner for the target language.
Strings of characters of the target language which we want to typeset specially are called
``interesting tokens''. Having had some experience with Web and SpiderWeb, we define three
categories of interesting tokens:
1. Reserved words of the target language: we want to typeset them in bold, say.
2. Other strings that we want to typeset specially, e.g. ≤ for [[<=]].
3. Comment and quoting characters: we want what follows them or what is enclosed by
them to be typeset literally.
The table [[translation]] defines a translation into TeX code for every
interesting token in the target language. Here is an excerpt from the translation table
for Mathematica:
[[translation["Block"] := "\\Cb{Block}"]]
[[translation["Break"] := "\\Cb{Break}"]]
[[translation[">="] := "\\ge"]]
[[translation["&&"] := "\\land"]]
(Here the control sequence [[\Cb]] selects the Courier bold font.)
We use four sets of strings to define the tokens in categories 2 and 3:
[[special]], [[comment1]], [[comment2]], [[quote]].
[[comment1]] is for unbalanced comment strings (none in Mathematica), [[comment2]]
is for balanced comment strings (here [[(*]] and [[*)]]), and [[quote]] is for
literal quotes ([["]]), which we assume to be balanced.
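
As a rough illustration, the four sets might be populated for Mathematica as in the
following Icon sketch. The contents of [[special]] are assumed for the example, and the
program is only a sketch, not the filter's actual definition chunk.

```icon
# Illustrative sketch only: the set contents are assumptions chosen for
# Mathematica, not the filter's actual definitions.
procedure main()
    local special, comment1, comment2, quote
    special  := set([">=", "<=", "&&"])   # category 2: strings typeset specially
    comment1 := set()                      # unbalanced comment strings (none here)
    comment2 := set(["(*", "*)"])          # balanced comment delimiters
    quote    := set(["\""])                # literal quote character
    every write("special: ", !sort(special))
    write(*comment1, " unbalanced comment strings")
    write(*comment2, " balanced comment delimiters, ", *quote, " quote string")
end
```

Keeping the four categories in separate sets lets the scanner decide, from the set a
token belongs to, whether to translate it or to switch to literal typesetting.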
Our approach to recognizing the interesting tokens while scanning a line is to maintain
a set of characters [[begin_token]] (an Icon cset) containing every character with which
an interesting token may begin. [[begin_token]] is the union of
- the cset defining the characters which may begin a reserved word, and
- the cset containing the initial characters of all strings in the [[special]],
[[comment1]], [[comment2]], and [[quote]] sets.
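
The union can be sketched in Icon as follows. The name [[reserved_start]] and the choice
of [[&ucase]] for it are assumptions (Mathematica's built-in names begin with an
uppercase letter), as are the example set contents.

```icon
# Sketch of building begin_token; set contents and reserved_start are
# assumptions made for this example.
procedure main()
    local special, comment2, quote, reserved_start, begin_token, s
    special  := set([">=", "<=", "&&"])
    comment2 := set(["(*", "*)"])
    quote    := set(["\""])
    reserved_start := &ucase            # characters that may begin a reserved word
    begin_token := reserved_start
    # Add the first character of every special, comment, and quote string.
    every s := !special | !comment2 | !quote do
        begin_token ++:= s[1]
    # Show the non-letter characters that ended up in begin_token.
    write(image(begin_token -- &ucase))
end
```

Because [[begin_token]] is a cset, a single [[upto(begin_token)]] call finds the next
position at which any interesting token could start.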
Given a line of text, we scan up to a character in [[begin_token]] and, depending on what
this character is, we may try to complete the token by further scanning. If we succeed,
we look up the token in the [[translation]] table: if the token is found, we output
its translation; otherwise we output the token itself unchanged. When a comment or quote
token is recognized, further processing of the line may stop, either for the rest of the
line or only until a matching token is found.
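
A minimal sketch of this scanning loop, under simplifying assumptions: the procedure name
[[filter_line]] is hypothetical, every letter is treated as a possible start of a
reserved word, the table entries are examples, and comment and quote handling is left
out entirely.

```icon
# Hypothetical helper: copy a line, replacing each token that has an entry
# in the translation table. Comment and quote handling is omitted.
procedure filter_line(line, translation, special, begin_token)
    local out, token
    out := ""
    line ? {
        # Copy everything up to the next character that may begin a token.
        while out ||:= tab(upto(begin_token)) do {
            if any(&letters) then
                token := tab(many(&letters ++ &digits))   # complete a word
            else
                # Try each special string; fall back to a single character.
                # (If one special string were a prefix of another, the
                # longer one would have to be tried first.)
                token := =!special | move(1)
            # Output the translation if there is one, else the token itself.
            out ||:= \translation[token] | token
        }
        out ||:= tab(0)                                   # rest of the line
    }
    return out
end

procedure main()
    local translation, special, begin_token
    translation := table()
    translation["Block"] := "\\Cb{Block}"   # assumed example entries
    translation[">="] := "\\ge"
    special := set([">=", "<=", "&&"])
    begin_token := &letters ++ '><&'
    write(filter_line("Block[{x}, x >= y]", translation, special, begin_token))
end
```

Scanning whole words before the table lookup keeps a reserved word from being
recognized inside a longer identifier.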
@
«Procedure [[main]]»=
procedure main (args)
«The [[translation]] table»
«Definition of interesting tokens»
«Emit special TEX definitions»
«Read and filter all the input lines»
end
@